Deriving Verb Predicates By Clustering Verbs with Arguments
Hand-built verb clusters such as the widely used Levin classes (Levin, 1993)
have proved useful, but have limited coverage. Verb classes automatically
induced from corpus data such as those from VerbKB (Wijaya, 2016), on the other
hand, can give clusters with much larger coverage, and can be adapted to
specific corpora such as Twitter. We present a method for clustering the
outputs of VerbKB: verbs with their multiple argument types, e.g.
"marry(person, person)", "feel(person, emotion)." We make use of a novel
low-dimensional embedding of verbs and their arguments to produce high quality
clusters in which the same verb can be in different clusters depending on its
argument type. The resulting verb clusters predict sarcasm, sentiment, and locus of control in tweets better than hand-built clusters do.
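As a rough illustration of the key idea, that the same verb can land in different clusters depending on its argument type, the sketch below greedily clusters toy "verb(argtype)" embeddings by cosine similarity. The vectors, threshold, and clustering procedure are invented for illustration; the paper's actual embeddings are learned from corpus data, and its clustering algorithm is not reproduced here.

```python
import math

# Toy verb(argtype) embeddings -- invented for illustration; the paper
# learns low-dimensional embeddings of verbs and arguments from corpora.
EMBEDDINGS = {
    "marry(person, person)":   [0.9, 0.1, 0.0],
    "divorce(person, person)": [0.8, 0.2, 0.1],
    "feel(person, emotion)":   [0.1, 0.9, 0.2],
    "sense(person, emotion)":  [0.2, 0.8, 0.1],
    "feel(person, object)":    [0.1, 0.2, 0.9],  # same verb, different argument type
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def greedy_cluster(items, threshold=0.8):
    """Assign each predicate to the most similar cluster centroid,
    or start a new cluster when no centroid is similar enough."""
    centroids, members = [], []
    for name, vec in items.items():
        sims = [cosine(vec, c) for c in centroids]
        if sims and max(sims) >= threshold:
            i = sims.index(max(sims))
            members[i].append(name)
            n = len(members[i])
            # Update the centroid as a running mean of its members.
            centroids[i] = [(c * (n - 1) + v) / n for c, v in zip(centroids[i], vec)]
        else:
            centroids.append(list(vec))
            members.append([name])
    return members

clusters = greedy_cluster(EMBEDDINGS)
for cluster in clusters:
    print(cluster)
```

On these toy vectors, "feel(person, emotion)" groups with "sense(person, emotion)" while "feel(person, object)" forms its own cluster, so the verb "feel" appears in two clusters, which is the behavior the abstract describes.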
Collecting Semantic Data by Mechanical Turk for the Lexical Knowledge Resource of a Text-to-Picture Generating System
WordsEye is a system for automatically converting natural language text into 3D scenes representing the meaning of that text. At the core of WordsEye is the Scenario-Based Lexical Knowledge Resource (SBLR), a unified knowledge base and representational system for expressing the lexical and real-world knowledge needed to depict scenes from text. To enrich a portion of the SBLR, we need to fill in contextual information about its objects, including their typical parts, typical locations, and the objects typically located near them. This paper presents our methodology for achieving this goal. First, we collect semantic information using Amazon’s Mechanical Turk (AMT). Then we manually filter and classify the collected data. Finally, we compare the manual results with the output of several automatic filtration techniques that use WordNet similarity and corpus association measures.
Data Collection and Normalization for Building the Scenario-Based Lexical Knowledge Resource of a Text-to-Scene Conversion System
WordsEye is a system for converting English text into three-dimensional graphical scenes that represent that text. It works by performing syntactic and semantic analyses of the input text, producing a description of the arrangement of objects in a scene. At the core of WordsEye is the Scenario-Based Lexical Knowledge Resource (SBLR), a unified knowledge base and representational system for expressing the lexical and real-world knowledge needed to depict scenes from text. This paper explores information collection methods for building the SBLR, using Amazon’s Mechanical Turk (AMT) and manual normalization of raw AMT data. It then describes a manual review of existing relations in the SBLR and the classification of the AMT data into existing and new semantic relations. Since manual annotation is a time-consuming and expensive approach, we also explored automatic normalization of AMT data through log-odds and log-likelihood ratios extracted from the English Gigaword corpus, as well as through WordNet similarity measures.
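The log-likelihood ratio used for this kind of corpus association scoring is commonly Dunning's G² statistic over a 2x2 co-occurrence table; a minimal sketch is below. The counts in the usage example are toy numbers, not counts taken from Gigaword.

```python
import math

def g2(k11, k12, k21, k22):
    """Dunning's G^2 log-likelihood ratio for a 2x2 co-occurrence table.

    k11: times the two words co-occur
    k12: times word 1 occurs without word 2
    k21: times word 2 occurs without word 1
    k22: times neither occurs
    """
    n = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    total = 0.0
    # Sum observed * log(observed / expected-under-independence) per cell.
    for obs, exp in [
        (k11, row1 * col1 / n),
        (k12, row1 * col2 / n),
        (k21, row2 * col1 / n),
        (k22, row2 * col2 / n),
    ]:
        if obs > 0:
            total += obs * math.log(obs / exp)
    return 2.0 * total

# Toy counts: a frequently co-occurring pair scores far higher than a rare one,
# so low-scoring AMT-elicited pairs could be filtered out.
strong = g2(50, 950, 100, 98900)
weak = g2(1, 999, 100, 98900)
print(strong, weak)
```

A score threshold on G² then serves as the automatic normalization step: pairs whose corpus association falls below the threshold are dropped or flagged for manual review.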
Collecting Spatial Information for Locations in a Text-to-Scene Conversion System
We investigate using Amazon Mechanical Turk (AMT) to build a low-level description corpus and populate VigNet, a comprehensive semantic resource that we will use in a text-to-scene generation system. To depict a location, VigNet must contain knowledge about the typical objects in that location and the arrangements of those objects. Such information is mostly common-sense knowledge that human beings take for granted, and it is not stated in existing lexical resources or text corpora. In this paper we focus on collecting the objects of locations using AMT. Our results show that this is a promising approach.
Annotation Tools and Knowledge Representation for a Text-To-Scene System
Text-to-scene conversion requires knowledge about how actions and locations are expressed in language and realized in the world. To provide this knowledge, we are creating a lexical resource (VigNet) that extends FrameNet with a set of intermediate frames (vignettes) that bridge between the high-level semantics of FrameNet frames and a new set of low-level primitive graphical frames. Vignettes can be thought of as a link between function and form: between what a scene means and what it looks like. In this paper, we describe the set of primitive graphical frames and the functional properties of 3D objects (affordances) we use in this decomposition. We also examine the methods and tools we have developed to populate VigNet with a large number of action and location vignettes.
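A minimal sketch of what a vignette might look like as a data structure follows. The schema, frame name, and relation names here are hypothetical stand-ins, not the actual VigNet representation; the point is only the shape of the bridge from one high-level frame to several primitive graphical frames.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class GraphicalPrimitive:
    """A low-level primitive graphical frame: a spatial or functional
    relation between two objects (relation names are hypothetical)."""
    relation: str   # e.g. "in-region", "grasping"
    figure: str
    ground: str

@dataclass
class Vignette:
    """Links a high-level, FrameNet-style frame to the graphical
    primitives that realize it visually."""
    frame: str
    primitives: List[GraphicalPrimitive] = field(default_factory=list)

# Hypothetical decomposition of a dish-washing scene.
wash_dishes = Vignette(
    frame="Cleaning",  # placeholder frame name
    primitives=[
        GraphicalPrimitive("in-region", "person", "sink"),
        GraphicalPrimitive("in-container", "dish", "sink"),
        GraphicalPrimitive("grasping", "person", "dish"),
    ],
)
print(wash_dishes.frame, len(wash_dishes.primitives))
```

One vignette per sense of an action or location keeps the high-level semantics ("washing dishes") separate from its concrete visual realization, which is what lets a renderer consume the primitives directly.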